DRAFT DRAFT DRAFT
So, as you may or may not know, Uber and Lyft - two major ridesharing services - stopped offering rides in Austin, TX on May 9, 2016, after a contentious public vote. You can read about it here: http://www.bizjournals.com/austin/news/2016/05/07/uber-lyft-defeated-in-prop-1-referendum.html
A common argument is that the presence of ride-sharing services such as Uber or Lyft decreases the incidence of drunken driving in a city in which they operate; it sounds reasonable enough to say that if there are more options for getting home after drinking, at least one of them might be more attractive than driving home in a potentially compromised state.
There have been a few articles and a couple of papers studying the availability of ridesharing services and drunk driving (see references below), but in my opinion they fall short in one or more ways:
I’ve managed to get a hold of three datasets that I think will be helpful:
The APD (Austin Police Department) Incident Reports are publicly available for years 2008-2011, and for ‘Year to Date’ which supposedly includes the last 18 months of reported crime in Austin but actually only includes 8 months of 2016 and a small number of reports from 2015. I was able to obtain an archived version of the data from late 2015, which has more complete data for Jan-Jul 2015.
The APD Chief’s Monthly Report is a high-level summary of crime in Austin for the month, including counts for high-profile crimes such as DWI. It has no time-of-day or location information - nothing beyond the fact that a DWI occurred - but it’s another source of data to compare to the incident reports, which may be spotty. It additionally contains data on narcotics arrests, which we can check as a potential control variable: narcotics crimes may follow the same broad trends as DWI, but are potentially unaffected by ridesharing.
It’s worth noting that the use of DWI arrest numbers as a proxy for the incidence of drunk driving is an open area of research - some studies prefer drunk driving crashes, although Austin has not released enough monthly statistics on drunk driving crashes to make them useful for what I want to do. We’ll assume DWI arrests are a viable proxy for drunk driving here.
The TABC (Texas Alcoholic Beverage Commission) Mixed Beverage Tax is a fixed-rate tax charged on alcoholic beverages sold by holders of a Mixed Beverage License in the state of Texas. I suspect that Mixed Beverage Tax revenues are associated with units of alcohol consumed in Austin, and we’ll check to see if any relationship exists between these tax revenues and DWI counts.
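Since TABC publishes the tax paid rather than gross sales, getting to revenue means dividing by the tax rate. Here’s a minimal sketch of that conversion - the 6.7% gross receipts rate and every number below are assumptions for illustration only, since the statutory rate has changed over the years; check the rate for whatever period you’re working with:

```r
# Hypothetical conversion from reported mixed beverage tax to gross revenue.
# The 6.7% rate is an assumption for illustration; verify it for your period.
tax_rate <- 0.067
receipts <- data.frame(
  date = as.Date(c('2016-01-01', '2016-02-01')),
  tax  = c(670000, 700000)   # made-up monthly tax totals for the whole city
)
receipts$rev <- receipts$tax / tax_rate   # implied gross revenue
receipts
```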
June 3, 2014: Lyft launches in Austin. (https://twitter.com/AustinLyft/status/473857057720766464)
June 4, 2014: Uber launches in Austin. (https://twitter.com/Uber_ATX/status/474199740217716736)
May 9, 2016: Uber and Lyft stop giving rides in Austin.
We’ll call June 4 and May 9 treatment dates - the inflection points we’ll investigate to see if the data on either side is substantially different. When dealing with monthly data we’ll have to say that June 2014 is the first month of the treatment and April 2016 is the last month.
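To keep the bookkeeping straight later on, the monthly data will carry a treatment factor. A quick sketch of how that coding works (the cut() breaks mirror the dates above; in the analysis below, the variable itself arrives pre-coded in the saved data set):

```r
# Code each month into a treatment period:
# 0 = pre-ridesharing, 1 = Uber/Lyft operating, 2 = post-exit.
months <- seq(as.Date('2014-04-01'), as.Date('2016-06-01'), by = 'month')
treatment <- cut(months,
                 breaks = as.Date(c('1900-01-01', '2014-06-01',
                                    '2016-05-01', '2100-01-01')),
                 labels = c('0', '1', '2'),
                 right = FALSE)   # June 2014 is the first treated month
data.frame(month = months, treatment)
```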
You can find the same data I used at these sites:
Or you can pull it from my Github, http://www.github.com/ianwells/atx-dwi
Alright, let’s get to it.
It’s always a good idea to just poke around in your data and see what’s going on. Here’s a quick glimpse of what the raw APD data looks like:
sample_n(dwi.raw,10)

Cool. We have a bunch of crimes, pre-filtered to intoxication-related offenses, along with the approximate time each occurred.
You can check this against the full list of crimes available to commit in Austin, which is on that same Austin Data Portal site. I’m not including it here because it’s quite long - please trust that I’ve got all of them.
levels(dwi.raw$crime)

[1] "CRASH/INTOX MANSLAUGHTER" "CRASH/INTOXICATION ASSAULT"
[3] "DRIVING WHILE INTOX / FELONY" "DUI - AGE 17 TO 20"
[5] "DWI" "DWI .15 BAC OR ABOVE"
[7] "DWI 2ND"
If you’re familiar with this data set you’ll notice I’ve removed a few crimes:
BOATING WHILE INTOXICATED - This rare crime doesn’t seem like it could be avoided by hailing an Uber.

DWI - DRUG RECOGNITION EXPERT - You don’t have to have any alcohol in your system to be charged with a DWI: in this case, an officer certified in recognizing the effects of some other substance has decided you shouldn’t be driving. While I imagine you could avoid trouble by hailing a ride in this case, I don’t think I could relate it to bar sales, so this stat is going to be relegated to some other study.

CRASH/LEAVING THE SCENE - This is a tricky one. Certainly a non-zero percentage of hit-and-run offenses are committed by drunk drivers, but this charge doesn’t offer any more details, and worse, there are more of this kind of crime than all the others put together. How should we count them, if at all? For now, let’s leave them all out, as they’re not counted in official DWI tallies. We might be able to project how many of these crimes involve alcohol at a later date, based on when and where they occurred, but we’ll leave that for another analysis.

Let’s take a look at the relative frequency of these offenses. Periodicity can be exploited later on.
qplot(hour(date), data=dwi.raw, geom="histogram", bins = 24)

Perhaps not surprisingly, nearly all DWI offenses occur between 8 PM and 5 AM.
qplot(wday(date), data=dwi.raw, geom="histogram", bins = 7)

Again, it looks like Fridays and Saturdays are popular days to get caught drunk driving.
We’re going to look at the whole series, and then a histogram of monthly data, to see if there’s a seasonal trend (you might expect more DWI’s to occur in December, January, or July - months with holiday weekends that involve a lot of drinking).
Note: we have incomplete data for July 2015 and August 2016, as well as an old offense in 2014, so we’ll remove those months now. We’ll leave 2015 and 2016 out of the monthly histogram because we only have data from a few months for those years.
dwi.raw$year <- year(dwi.raw$date)
dwi.raw$month <- month(dwi.raw$date)
dwi.monthly <- dwi.raw %>% group_by(year,month)
dwi.monthly <- summarize(dwi.monthly,count=n())
dwi.monthly$date <- as.Date(paste0(dwi.monthly$year,'-',dwi.monthly$month,'-01'))
dwi.monthly <- dwi.monthly[-56,] #remove partial July 2015 data
dwi.monthly <- dwi.monthly[-63,] #remove partial August 2016 data
dwi.monthly <- dwi.monthly[-49,] #remove lone 2014 arrest
label.months <- c('J','F','M','A','M','J','J','A','S','O','N','D')
ggplot(data=dwi.monthly, aes(x = date, y = count, group = year(date), color = factor(year(date)))) + geom_point()

#qplot(factor(month(date)), data=filter(dwi.monthly,date < '2015-01-01'), geom="bar", weight = count)
dummy <- filter(dwi.monthly,date < '2015-01-01') %>% group_by(month(date)) %>% summarize(count = sum(count)) %>% select(count)
dummy1 <- mean(dummy$count) #1824
dummy2 <- sd(dummy$count) #99
ggplot(data = filter(dwi.monthly,date < '2015-01-01'), aes(factor(month(date)),weight = count) ) + geom_bar() + geom_hline(yintercept=1824, color = 'red') +
geom_hline(yintercept=1724, color = 'gray') +
geom_hline(yintercept=1924, color = 'gray')

So we can get a sense that, for the years 2008-2011, monthly DWI’s seem to be decreasing (good job, APD). As for a monthly trend, we see an annual bump in December-January, something in March, and something going on in August-October, although all of these stay within one standard deviation of our monthly mean.
Now, I’m curious to see how the Chief’s Monthly Report compares.
apd.raw$date <- as.Date(apd.raw$date)
apd.raw$year <- year(apd.raw$date)
apd.raw$month <- month(apd.raw$date)
apd.monthly <- apd.raw
ggplot(data=apd.monthly, aes(x = date, y = count, color = factor(year))) + geom_point() #qplot(factor(month(date)), data=apd.monthly, geom="bar", weight = count)
dummy <- apd.monthly %>% group_by(month(date)) %>% summarize(count = sum(count)) %>% select(count)
dummy1 <- mean(dummy$count) #3290
dummy2 <- sd(dummy$count) #145
ggplot(data = apd.monthly, aes(factor(month(date)),weight = count) ) + geom_bar() + geom_hline(yintercept=3290, color = 'red') +
geom_hline(yintercept=3145, color = 'gray') +
geom_hline(yintercept=3435, color = 'gray')

This has similar bumps, with a weaker September. Let’s plot both at the same time.
both.monthly <- merge(apd.monthly,dwi.monthly, by = 'date', all = TRUE);
both.monthly$apd <- both.monthly$count.x
both.monthly$dwi <- both.monthly$count.y
both.monthly <- both.monthly %>% select(date,apd,dwi)
both.melted <-melt(both.monthly, id.vars="date")
ggplot(data=both.melted, mapping = aes(x=date,y=value, color = variable)) + geom_point()

Warning: Removed 73 rows containing missing values (geom_point).
Okay, so it looks like the chief’s report differs slightly from the incident report database. That’s annoying, but based on the disclaimers on both websites, and the fact that the chief’s report is updated more often, I’m going to call it: the chief’s data is what I’m using moving forward. I’d still like those extra two years of data from the database, though - and since the difference between the two sources is so consistent, what I’m going to do is nudge the database numbers up slightly so that they line up.
both.monthly$diff <- both.monthly$apd - both.monthly$dwi
mean(filter(both.monthly,date < '2012-01-01', date > '2010-01-01')$diff)

[1] 14.86957
both.monthly.nudge <- both.monthly
both.monthly.nudge$dwi = both.monthly.nudge$dwi + 15 #approximate mean difference
both.melted.nudge <-melt(both.monthly.nudge, id.vars="date")
ggplot(data=both.melted.nudge, mapping = aes(x=date,y=value, color = variable)) + geom_point()

Warning: Removed 145 rows containing missing values (geom_point).
Alright, that’s a little cleaner. We’ll move forward with this set of monthly data until I can get more comprehensive arrest records - being able to use weekly periodicity might be useful.
The Chief’s Monthly Report lists all Part 1 Offenses (according to the Uniform Crime Reporting standards established by the FBI, mostly violent felonies) and a few key Part 2 Offenses, namely DWI, narcotics, prostitution, and weapons-law related crimes. Prostitution and weapons crimes occur sparsely compared to DWI and narcotics, so we’ll import the narcotics arrest data in hopes that it will allow us to control for sweeping effects across all crime categories - things like the number of officers working, motivation of the police force, population of the city, etc.
ggplot(data=apd.monthly, aes(x = date, y = narc, color = factor(year))) + geom_point()

qplot(factor(month(date)), data=apd.monthly, geom="bar", weight = narc)

Not a lot to say here - there’s a strong January, but not a lot else that lines up with the DWI data.
I’ve saved out our final data set here to save time.
dwi <- read.csv('data/dwi-final.csv')
#sapply(dwi,class)
dwi$date <- as.Date(dwi$date)
dwi$count <- as.numeric(as.character(dwi$count)) #cast strictly
dwi$narc <- as.numeric(as.character(dwi$narc))

Warning: NAs introduced by coercion
dwi$treatment <- as.factor(dwi$treatment) #0: pre uber, 1: uber, 2: post uber

Let’s have a look at what’s going on with the TABC data. Now, I’ve already done a bit of work with this data set in the past (see tabcmap.s3-website-us-east-1.amazonaws.com), so I’ve just pulled values from my database of monthly revenues calculated from monthly tax receipts.
Here’s what a few rows look like:
sample_n(tabc.raw,10)

Let’s do the same all-time and then year-over-year plots to get a sense of what’s in there.
There’s a lone data point from 1999 - let’s clean that up too, and scale revenues into millions of dollars.
tabc.monthly <- tabc.raw[-40,] #remove lone 1999 point
tabc.monthly$date <- as.Date(tabc.monthly$date)
tabc.monthly$year <- year(tabc.monthly$date)
tabc.monthly$month <- month(tabc.monthly$date)
tabc.monthly$rev <- round(tabc.monthly$rev/1000000,2)
ggplot(data=tabc.monthly, aes(x = date, y = rev, group = year, color = factor(year))) + geom_point()

qplot(factor(month(date)), data=tabc.monthly, geom="bar", weight = rev)

Look at that growth! Notice that spike in March? That’s SXSW, a major annual music, technology, and film conference. See that spike in October? That’s Austin City Limits, a major music festival. This data is already good to go, so let’s rename it to keep things clean and keep going.
tabc <- select(tabc.monthly,date,rev)

Let’s see about a new variable - DWI per Million Dollars of Bar Revenue. I think this variable may be useful because it incorporates things you would expect to vary with bar revenue - drinking-age population and the popularity of drinking - into the DWI count. All things being equal, a town with a lower DWI-per-drink-consumed number is doing better than one with a higher ratio. We may be able to link ridesharing to an improvement in this ratio more convincingly than to raw DWI’s alone.
dwi <- merge(dwi,tabc, by = 'date')
dwi$count_per_mm <- dwi$count/dwi$rev
ggplot(data=dwi, aes(x = date, y = count_per_mm, group = year(date), color = factor(year(date)))) + geom_point()

ggplot(data=dwi, aes(x = rev, y = count, color = year(date))) + geom_point() + geom_smooth(method = "lm")

summary(lm(data = dwi, count ~ rev))
Call:
lm(formula = count ~ rev, data = dwi)
Residuals:
Min 1Q Median 3Q Max
-124.302 -36.895 2.742 37.032 112.456
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 471.91537 21.21465 22.245 <2e-16 ***
rev 0.01024 0.47007 0.022 0.983
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 51.7 on 105 degrees of freedom
Multiple R-squared: 4.519e-06, Adjusted R-squared: -0.009519
F-statistic: 0.0004745 on 1 and 105 DF, p-value: 0.9827
So while DWI/$MM is decreasing over time, it definitely has periods where it’s still dominated by a rising DWI rate, and there appears to be no direct correlation between DWI counts and bar revenue - not too surprising, since both series are clearly non-stationary, so a simple linear regression between them doesn’t mean much.
qplot(factor(month(date)), data=dwi, geom="bar", weight = count_per_mm)

This variable does, however, have a much more obvious yearly peak in January - we may be able to exploit this entangled variable more easily than raw DWI’s, if we can isolate the effect of ridesharing on TABC revenue (which we’ll talk about later).
Alright, so now that we’ve got some data, what can we do with it? How can we establish one way or the other that the advent of ridesharing affected the number of DWI arrests in Austin, or the number of DWI’s per dollar of alcohol sold? There are a few things I’m going to look at.
Typically, when you set out to prove something with science, you sit down and design an experiment. We don’t have the resources to do a double-blind city-scale ridesharing/drunk-driving experiment, so we’ll have to use the next best things: fancy math and historical data.
Regression Discontinuity Design is a method of establishing causality without the benefit of experimental design. The key assumption is that, for a time series, the data on either side of a treatment inflection point are otherwise similar - nothing else special has happened. If you can get behind that, you can claim that if there’s a kink in your data, it’s because of the treatment you’re investigating.
Using this technique on time series data can be tricky, especially when you’re working with 8 years of data, because surely something else besides your treatment happened over that time frame. Let’s remind ourselves what the DWI count looks like:
ggplot(data = dwi, mapping = aes(x=date,y=count, color = factor(year(date)))) + geom_point()

You can probably see a lot of jumps in the data - from 2009-2011 there’s a rapid rise in DWI, followed by a swift drop and then a leap up again in 2013, and a gradual drop into 2016. We can’t honestly include all that data, so we’re going to limit things to about two years on either side of the first treatment (Uber and Lyft open for business in Austin), and the 8 months we have after the second treatment (Uber and Lyft cease operations).
First, we’ll plot regressions on either side of the treatment and see how they look - with just lm to start and then with a specialized library.
ggplot(data = filter(dwi,date > '2012-05-01'),mapping = aes(date, count, color = treatment) )+ geom_point() + geom_smooth(method = 'lm')

Visually, you might be tempted to get a sense of the effect of ridesharing over the period marked in green, but let’s keep going. We’re going to use the rddtools library to see if there is a statistically significant change in the regressions before and after each discontinuity. We’ll also compare rddtools to the output of lm.
dwi$index <- seq.int(nrow(dwi))
dwi.w <- filter(dwi, date < '2016-05-01', date > '2012-05-01')
dwi.rdd.w <- rddtools::rdd_data(y = dwi.w$count,x = dwi.w$index,cutpoint = 77)
dwi.wo <- filter(dwi, date < '2016-12-31', date > '2014-05-30')
dwi.rdd.wo <- rddtools::rdd_data(y = dwi.wo$count,x = dwi.wo$index,cutpoint = 100)
lm.w<-lm(count~I(index-77)*treatment,dwi.w)
reg_para.w <- rdd_reg_lm(rdd_object = dwi.rdd.w, order = 1)
lm.wo<-lm(count~I(index-8)*treatment,dwi.wo)
reg_para.wo <- rdd_reg_lm(rdd_object = dwi.rdd.wo, order = 1)
print(reg_para.w)

### RDD regression: parametric ###
Polynomial order: 1
Slopes: separate
Number of obs: 47 (left: 23, right: 24)
Coefficient:
Estimate Std. Error t value Pr(>|t|)
D -6.3097 18.7822 -0.3359 0.7386
plot(reg_para.w)

summary(lm.w)
Call:
lm(formula = count ~ I(index - 77) * treatment, data = dwi.w)
Residuals:
Min 1Q Median 3Q Max
-71.696 -18.462 -6.967 17.833 76.336
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 496.9667 12.6479 39.292 <2e-16 ***
I(index - 77) -1.3000 0.9423 -1.380 0.175
treatment1 -13.1801 18.6991 -0.705 0.485
I(index - 77):treatment1 -0.3561 1.3773 -0.259 0.797
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 31.95 on 43 degrees of freedom
Multiple R-squared: 0.4199, Adjusted R-squared: 0.3794
F-statistic: 10.38 on 3 and 43 DF, p-value: 2.921e-05
For the first treatment, RDD tells us that the drop is small enough relative to the variance in the data that any bump and change in the slope of the regression is too small to be attributed to anything other than randomness - at least when modeled with a straight line. We’ll consider that a straight line isn’t the best way to fit this data later on - remember we spent some time investigating periodic patterns.
print(reg_para.wo)

### RDD regression: parametric ###
Polynomial order: 1
Slopes: separate
Number of obs: 30 (left: 22, right: 8)
Coefficient:
Estimate Std. Error t value Pr(>|t|)
D -64.579 23.084 -2.7975 0.009565 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(reg_para.wo)

summary(lm.wo)
Call:
lm(formula = count ~ I(index - 8) * treatment, data = dwi.wo)
Residuals:
Min 1Q Median 3Q Max
-71.696 -18.593 -8.336 22.383 76.336
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 598.059 83.895 7.129 1.43e-07 ***
I(index - 8) -1.656 1.032 -1.604 0.121
treatment2 -782.631 601.778 -1.301 0.205
I(index - 8):treatment2 8.120 6.291 1.291 0.208
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 32.84 on 26 degrees of freedom
Multiple R-squared: 0.2247, Adjusted R-squared: 0.1353
F-statistic: 2.512 on 3 and 26 DF, p-value: 0.08065
For the second treatment, the jump in the regression is more pronounced and the slope changes wildly - our p-values are low enough to conclude that the change may be due to the discontinuity in May 2016 rather than random chance. This is a point in favor of ridesharing, but we’re not out of techniques yet.
Our regression discontinuity design fit 1st-degree polynomials to the DWI data, but what if the behavior is more complicated than that? We observed earlier that there may be a periodic component to the data - what if ridesharing disrupts busy DWI months but not slower months? Would a simple linear regression uncover that?
Exponential Smoothing is a class of methods for predicting future values in a time series based on manipulations of past values. Holt-Winters Forecasting is a method of predicting time-series data that has strong periodic components by separately smoothing long-term trend, short-term seasonality, and local components to create a forecast that takes into account predictable variations in the data. We saw above that our DWI data probably has some yearly trends in it - the data shows it a little, and you can Google around for articles about DWI in Austin and see for yourself that January is a special month for the police (look for ‘no-refusal weekend’). We’ll decompose the DWI and DWI/$ really quick and you can see for yourself:
plot(stl(ts(dwi$count, start = c(2008, 1), frequency = 12), s.window="periodic"))

plot(stl(ts(dwi$count_per_mm, start = c(2008, 1), frequency = 12), s.window="periodic"))

You can see seasonality contributes to a swing of 70 DWI arrests, or 2.5 arrests per $MM.
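Before pointing Holt-Winters at the real numbers, here’s a minimal synthetic sketch of what the method does - every value below is made up purely for illustration. We build a monthly series with a downward trend, a yearly cycle, and some noise, then let HoltWinters() separate those components and forecast ahead:

```r
# Synthetic monthly series: downward trend + yearly seasonality + noise.
set.seed(42)
n      <- 96                                   # 8 years of monthly data
trend  <- 500 - 0.5 * seq_len(n)               # slow decline
season <- 40 * sin(2 * pi * seq_len(n) / 12)   # yearly cycle
y <- ts(trend + season + rnorm(n, sd = 10), start = c(2008, 1), frequency = 12)

fit <- HoltWinters(y)   # smooths level, trend, and seasonal components
fc  <- predict(fit, n.ahead = 12, prediction.interval = TRUE, level = 0.9)
plot(fit, fc)           # fitted values plus a one-year forecast band
```

Because the components are smoothed separately, the forecast reproduces the seasonal swing rather than averaging it away - which is exactly why we’d hope it can model the January DWI bump.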
What I’m going to do is feed DWI data from January 2008 - May 2014 (the last month before Uber and Lyft were widely available in Austin) into a Holt-Winters model and forecast DWI counts for the next few months (we can roll the data back to 2008 again because we’re not concerned with local effects so much as trends inherent in DWI counts). I’m then going to compare those to the actual data, with the reasoning that since the model takes into account yearly cycles and broad trends, it will come up with a prediction for DWI counts as if ridesharing never appeared. We can call the difference the number of DWI’s ridesharing prevented.
Let’s get confident in this method - first I’m going to take some untreated months and see if we can predict them reliably.
dwi.pre <- filter(dwi,date < '2013-10-31')
dwi.post <- filter(dwi,date > '2013-10-31')
dwi.pre.count.ts <- ts(dwi.pre$count, start = c(2008, 1), frequency = 12)
dwi.post.count.ts <- ts(dwi.post$count, start = c(2013, 11), frequency = 12)
dwi.pre.count.hw <- HoltWinters(dwi.pre.count.ts)
forecast <- predict(dwi.pre.count.hw, n.ahead = 6, prediction.interval = T, level = 0.9)
plot(dwi.pre.count.hw, forecast); lines(dwi.post.count.ts,col='green')

effect <- forecast - dwi.post.count.ts
pct.effect <- cumsum(effect)/cumsum(forecast)
pct.effect[6]

[1] 0.01460054
Not bad. The model is a little rough - it fits approximately but fails to catch all the peaks. The data may not be periodic enough for this model to be effective, or there may be too much non-seasonal noise. However, using the data from January 2008 to October 2013, it looks like we can predict cumulative DWI’s for the next 6 months to within 1.5%. That should make us feel better about the numbers coming out, with the caveat that cumulative DWI predictions over longer time frames may be more accurate than predictions for individual months. Let’s move the end of the training data to May 2014 (the last month without Uber and Lyft) and see how things look.
dwi.pre <- filter(dwi,date < '2014-05-31')
dwi.post <- filter(dwi,date > '2014-05-31')
dwi.pre.count.ts <- ts(dwi.pre$count, start = c(2008, 1), frequency = 12)
dwi.post.count.ts <- ts(dwi.post$count, start = c(2014, 6), frequency = 12)
dwi.pre.count.hw <- HoltWinters(dwi.pre.count.ts)
forecast <- predict(dwi.pre.count.hw, n.ahead = 24, prediction.interval = T, level = 0.9)
plot(dwi.pre.count.hw, forecast); lines(dwi.post.count.ts,col='green')

effect <- forecast - dwi.post.count.ts
pct.effect <- cumsum(effect)/cumsum(forecast)
pct.effect[12]

[1] 0.07247797

pct.effect[3]

[1] 0.01613347
For the first 12 months Uber and Lyft were available in Austin, the real cumulative DWI count was 440 arrests fewer than the predicted count - an effect of about 7% (over the whole 24-month period, we see 8%). For the first 3 months, where our error bars are much tighter (and perhaps while Uber and Lyft were still growing in capacity), we’re only 24 DWI’s below the prediction - a decrease of 1.6%.
We can try a potentially better-fitting model from the exponential smoothing family with ets, from the forecast package.
dwi.pre.count.ets <- ets(dwi.pre.count.ts, model = 'ZZZ')
forecast.ets <- predict(dwi.pre.count.ets, n.ahead = 24, prediction.interval = T, level = 0.9)
plot(forecast.ets); lines(dwi.post.count.ts,col='green')

effect.ets <- forecast.ets$mean - dwi.post.count.ts
pct.effect.ets <- cumsum(effect.ets)/cumsum(forecast.ets$mean)
pct.effect.ets[12]

[1] 0.08457409

pct.effect.ets[3]

[1] 0.05609513
With a much tighter margin of error, ets suggests a drop of 8.5% over 12 months (9.7% over the full 24) and 5.6% over 3 months.
I’d like to look at our DWI-per-revenue variable, but first we have to answer a question: how much did ridesharing affect bar revenues? If the effect is large, we’ll have a hard time attributing a change in DWI/$ to a change in DWI’s.
dwi.pre.rev.ts <- ts(dwi.pre$rev, start = c(2008, 1), frequency = 12)
dwi.post.rev.ts <- ts(dwi.post$rev, start = c(2014, 6), frequency = 12)
dwi.pre.rev.ets <- ets(dwi.pre.rev.ts, model = 'ZZZ')
forecast.ets <- predict(dwi.pre.rev.ets, n.ahead = 24, prediction.interval = T, level = 0.9)
plot(forecast.ets); lines(dwi.post.rev.ts,col='green')

effect.ets <- forecast.ets$mean - dwi.post.rev.ts
pct.effect.ets <- cumsum(effect.ets)/cumsum(forecast.ets$mean)
pct.effect.ets[3]

[1] -0.02926871
Looks like we see a ~3% rise in TABC sales due to ridesharing - which seems easy to reason about: if you don’t have to worry about driving, you could be drinking more.
dwi.pre.count_per_mm.ts <- ts(dwi.pre$count_per_mm, start = c(2008, 1), frequency = 12)
dwi.post.count_per_mm.ts <- ts(dwi.post$count_per_mm, start = c(2014, 6), frequency = 12)
dwi.pre.count_per_mm.ets <- ets(dwi.pre.count_per_mm.ts, model = 'ZZZ')
forecast.ets <- predict(dwi.pre.count_per_mm.ets, n.ahead = 24, prediction.interval = T, level = 0.9)
plot(forecast.ets); lines(dwi.post.count_per_mm.ts,col='green')

effect.ets <- forecast.ets$mean - dwi.post.count_per_mm.ts
pct.effect.ets <- cumsum(effect.ets)/cumsum(forecast.ets$mean)
pct.effect.ets[12]

[1] 0.1334951

pct.effect.ets[3]

[1] 0.1029293
ETS attributes a whopping 13% drop in DWI/$MM over 12 months (10% over 3) to ridesharing - well beyond what the rise in revenue alone would explain.
Let’s try the other side - fit a model on data starting from Uber’s first month and stopping at Uber’s last month. If the prediction is lower than the real data, we might be able to say that ridesharing was having an effect, but that some other factor was contaminating the data from 2008-2014 (say, an organized campaign against drunk driving).
We’re going to cheat a tiny bit because we need another sample of data (Holt-Winters requires 2 full years of data here) - so I’m going to say Uber actually opened for business a month earlier than it did. Hopefully the first month was relatively slow as people started adopting ridesharing, and it won’t make much of a difference to stretch the effect out over one more month in addition to the 23 months Uber and Lyft were in Austin.
dwi.during <- dwi %>% filter(date >= '2014-05-01') %>% filter(date < '2016-05-01')
dwi.exit <- filter(dwi,date >= '2016-05-01')
dwi.during.count.ts <- ts(dwi.during$count, start = c(2014, 5), frequency = 12)
dwi.exit.count.ts <- ts(dwi.exit$count, start = c(2016, 5), frequency = 12)
dwi.during.count.hw <- HoltWinters(dwi.during.count.ts)
forecast <- predict(dwi.during.count.hw, n.ahead = 12, prediction.interval = T, level = 0.9)
plot(dwi.during.count.hw, forecast); lines(dwi.exit.count.ts,col='green')

effect <- forecast - dwi.exit.count.ts
pct.effect <- cumsum(effect)/cumsum(forecast)

Warning in cumsum(effect)/cumsum(forecast): longer object length is not a multiple of shorter object length
pct.effect[7]

[1] -0.04835785

pct.effect[3]

[1] -0.06414393
This looks better - the model fits better, the confidence interval is tighter, and the treatment effect is 80 DWI’s over the first 3 months, a decrease of about 6%. In other words, not having Uber and Lyft may have resulted in an additional 80 arrests. After 7 months (taking us to the end of our data), we observe a smaller 5% decrease - perhaps we can attribute this to the smaller ridesharing services (like Fare) that sprang up in Austin after Uber and Lyft stopped offering rides.
Let’s do the same with DWI/$MM:
dwi.during.rev.ts <- ts(dwi.during$rev, start = c(2014, 5), frequency = 12)
dwi.exit.rev.ts <- ts(dwi.exit$rev, start = c(2016, 5), frequency = 12)
dwi.during.rev.hw <- HoltWinters(dwi.during.rev.ts)
forecast <- predict(dwi.during.rev.hw, n.ahead = 12, prediction.interval = T, level = 0.9)
plot(dwi.during.rev.hw, forecast); lines(dwi.exit.rev.ts,col='green')

effect <- forecast - dwi.exit.rev.ts
pct.effect <- cumsum(effect)/cumsum(forecast)

Warning in cumsum(effect)/cumsum(forecast): longer object length is not a multiple of shorter object length
pct.effect[7]

[1] 0.07915371

pct.effect[3]

[1] 0.06589074
Bad news for bars - Holt-Winters suggests that revenues were depressed by 6.5% over the first 3 months without Uber and Lyft.
dwi.during.count_per_mm.ts <- ts(dwi.during$count_per_mm, start = c(2014, 5), frequency = 12)
dwi.exit.count_per_mm.ts <- ts(dwi.exit$count_per_mm, start = c(2016, 5), frequency = 12)
dwi.during.count_per_mm.hw <- HoltWinters(dwi.during.count_per_mm.ts)
forecast <- predict(dwi.during.count_per_mm.hw, n.ahead = 12, prediction.interval = T, level = 0.9)
plot(dwi.during.count_per_mm.hw, forecast); lines(dwi.exit.count_per_mm.ts,col='green')

effect <- forecast - dwi.exit.count_per_mm.ts
pct.effect <- cumsum(effect)/cumsum(forecast)

Warning in cumsum(effect)/cumsum(forecast): longer object length is not a multiple of shorter object length
pct.effect[7]

[1] -0.1268825

pct.effect[3]

[1] -0.1481808
That’s 14% higher DWI/$MM over the first 3 months without Uber, or 12% through the end of 2016 - again exceeding the effect that losing ridesharing had on revenue alone.
We’ve got one more trick up our sleeves.
Google has a great library, CausalImpact, for testing the causal effect of a given treatment on a time series. Designed for testing the effectiveness of advertising, I think it will help us out - it’s built for inference on non-experimental data, which is what we’ve got.
For this, in contrast to regression discontinuity or exponential smoothing methods, we need a response and a predictor - one thing we think the treatment affected, and one thing we’re confident it didn’t. Let’s use DWI count as the response and TABC revenue as the control, since TABC sales growth appears less affected than DWI per our experiments in the previous section; we’ll also try narcotics arrests as a control.
We need a well-behaved stretch of data to use as a control - so let’s re-use the same cutoff we used for our regression discontinuity design (2012-05-01). Let’s plot it to see if it looks like anything strange happened to our control data over our period of interest.
uber <- geom_rect(mapping = aes(xmin=as.Date('2014-06-01'), xmax = as.Date('2016-05-01')), ymin=-Inf, ymax=+Inf, color="grey20", alpha=0.3, inherit.aes = FALSE)
ggplot(data=melt(select(filter(dwi,date>="2012-04-01"),narc,count,rev,date),id = 'date'), mapping = aes(x=date, y=value, color = variable)) + uber + geom_point() + geom_smooth(method = 'loess', se = FALSE)

Not the best, but it’s what we’ve got. I’m especially concerned by the trend in TABC revenue and the little uptick in narcotics arrests in late 2016. If CausalImpact doesn’t like it, it should tell us - we’ll see how it goes.
(If you’re running this at home, note that you’ll need the development version of devtools to install CausalImpact due to a bug in the CRAN version of devtools. Get it with devtools::install_github("hadley/devtools"))
dwi.ci <- select(filter(dwi,date>="2012-04-01"),date,count,rev)
pre.ci <- as.Date(c("2012-04-01", "2014-05-31"))
post.ci <- as.Date(c("2014-06-01", "2016-05-01"))
impact <- CausalImpact(dwi.ci, pre.ci, post.ci, model.args = list( prior.level.sd = 0.1,nseasons = 12, season.duration = 1))
plot(impact)
impact
Posterior inference {CausalImpact}

                         Average        Cumulative
Actual                   463            11114
Prediction (s.d.)        515 (19)       12370 (457)
95% CI                   [478, 551]     [11477, 13218]

Absolute effect (s.d.)   -52 (19)       -1256 (457)
95% CI                   [-88, -15]     [-2104, -363]

Relative effect (s.d.)   -10% (3.7%)    -10% (3.7%)
95% CI                   [-17%, -2.9%]  [-17%, -2.9%]

Posterior tail-area probability p: 0.002
Posterior prob. of a causal effect: 99.8%

For more details, type: summary(impact, "report")
CausalImpact attributes a 10% drop in DWIs to ridesharing, but can we trust it? Our control data has trends in it. If we give CausalImpact a 'fake' treatment date when we know nothing happened - say, in the middle of 2013 - it shouldn't report any change in DWI due to this made-up treatment.
dwi.ci <- select(filter(dwi,date>="2012-04-01"),date,count,rev)
pre.ci <- as.Date(c("2012-04-01", "2013-09-30"))
post.ci <- as.Date(c("2013-10-01", "2014-05-01"))
impact <- CausalImpact(dwi.ci, pre.ci, post.ci, model.args = list( prior.level.sd = 0.1,nseasons = 12, season.duration = 1))
plot(impact)
Uh oh: a 4% drop. We may not be able to use these results, at least with TABC revenue as a control. The upward trend in revenue through our treatment period and the downward trend in DWI may be distorting the model, so it may always tell us there's a drop in DWI arrests.
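One fake date isn't much of a robustness check on its own. Here's a sketch (assuming the `dwi` data frame and the same calling pattern used above; the placebo dates are arbitrary choices of mine) of running the placebo test over several fake treatment dates and collecting the estimated relative effect each time - a trustworthy control should yield effects near zero across the board:

```r
library(dplyr)
library(CausalImpact)

# Hypothetical placebo dates, all well before ridesharing arrived in June 2014.
fake.dates <- as.Date(c("2013-04-01", "2013-10-01", "2014-01-01"))

placebo <- sapply(fake.dates, function(d) {
  dat  <- select(filter(dwi, date >= "2012-04-01"), date, count, rev)
  pre  <- c(as.Date("2012-04-01"), d - 1)
  post <- c(d, as.Date("2014-05-01"))
  imp  <- CausalImpact(dat, pre, post,
                       model.args = list(prior.level.sd = 0.1,
                                         nseasons = 12, season.duration = 1))
  # Average relative effect; values near zero mean the control behaves well.
  imp$summary$RelEffect[1]
})
names(placebo) <- as.character(fake.dates)
round(placebo, 3)
```

If most of these come back meaningfully negative, the model is finding "effects" that can't be real, and we should distrust the headline estimate.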
Since we're worried that TABC revenue is a bad control variable, let's see if narcotics arrests are better.
dwi.ci.n <- filter(select(dwi,date,count,narc),date >= '2012-04-01')
pre.ci <- as.Date(c("2012-04-01", "2014-05-31"))
post.ci <- as.Date(c("2014-06-01", "2016-05-09"))
impact.n <- CausalImpact(dwi.ci.n, pre.ci, post.ci,model.args = list(prior.level.sd = 0.1,nseasons = 12, season.duration = 1))
plot(impact.n)
impact.n
Posterior inference {CausalImpact}

                         Average       Cumulative
Actual                   463           11114
Prediction (s.d.)        488 (18)      11716 (433)
95% CI                   [452, 525]    [10855, 12606]

Absolute effect (s.d.)   -25 (18)      -602 (433)
95% CI                   [-62, 11]     [-1492, 259]

Relative effect (s.d.)   -5.1% (3.7%)  -5.1% (3.7%)
95% CI                   [-13%, 2.2%]  [-13%, 2.2%]

Posterior tail-area probability p: 0.084
Posterior prob. of a causal effect: 92%

For more details, type: summary(impact, "report")
Using narcotics arrests as a control, ridesharing can account for about a 5% drop in DWI arrests. Let's double-check with a fake treatment date:
dwi.ci.n <- filter(select(dwi,date,count,narc),date >= '2012-04-01')
pre.ci <- as.Date(c("2012-04-01", "2014-09-30"))
post.ci <- as.Date(c("2014-10-01", "2016-05-09"))
impact.n <- CausalImpact(dwi.ci.n, pre.ci, post.ci,model.args = list(prior.level.sd = 0.1,nseasons = 12, season.duration = 1))
plot(impact.n)
impact.n
Posterior inference {CausalImpact}

                         Average       Cumulative
Actual                   462           9247
Prediction (s.d.)        479 (17)      9583 (331)
95% CI                   [445, 510]    [8892, 10203]

Absolute effect (s.d.)   -17 (17)      -336 (331)
95% CI                   [-48, 18]     [-956, 355]

Relative effect (s.d.)   -3.5% (3.5%)  -3.5% (3.5%)
95% CI                   [-10%, 3.7%]  [-10%, 3.7%]

Posterior tail-area probability p: 0.155
Posterior prob. of a causal effect: 84%

For more details, type: summary(impact, "report")
A similar drop shows up even for the fake treatment date, so this control looks like it's no good either.
Our conclusions were a bit mixed, but promising. The regression discontinuity analysis suggested there might be an intervention effect from Uber and Lyft ceasing ridesharing in Austin. The exponential smoothing techniques, once broad trends and periodic cycles in both data sets were taken into account, suggested measurable drops in DWI counts and in DWI counts per MM of bar revenue following the introduction of ridesharing (along with a small uptick in bar revenue), and a subsequent increase in both measures after Uber and Lyft stopped offering rides.
The CausalImpact analysis didn't produce results we could use - neither TABC revenue nor narcotics arrests turned out to be a suitable control.
Taking into account our results, the models suggest attributing a 5-10% drop in DWI in Austin from June 2014 - May 2016 to ridesharing, and a 0-5% increase in DWI from May 2016 - December 2016 to an abrupt cessation of Uber and Lyft’s ridesharing services.
With more fine-grained time series data, or some certified-extra-accurate data from the APD, I think we could improve on this analysis.
Deterring Drunk Driving Fatalities: An Economics of Crime Perspective Bruce L. Benson and David W. Rasmussen
Inferring Causal Impact Using Bayesian Structural Time-series Models Kay H. Brodersen, Fabian Gallusser, Jim Koehler, Nicolas Remy, Steven L. Scott
Regression Discontinuity Designs in Economics David S. Lee and Thomas Lemieux
The Austin Police Department does not assume any liability for any decision made or action taken or not taken by the recipient in reliance upon any information or data provided.
The Comptroller of Public Accounts cannot vouch for the data or analysis derived from data after it has been retrieved from the Comptroller’s Web site.
The MIT License (MIT)
Copyright (c) 2017 Ian Wells
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.